Complex dynamics
Review for NeurIPS paper: Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
Weaknesses: One of the main assumptions used in characterizing the "threshold states", at which the gradient flow dynamics appear to get trapped, is that the Hessian is positive semidefinite. However, in figure 2, as the training loss crosses the threshold energy, the minimal eigenvalues of the Hessian appear clearly negative, unless I am misunderstanding the figure. The authors do not appear to address this point. Could the soundness of this assumption also account for the inaccuracy of the computed value of alpha at which the phase transition occurs, which can be seen in figure 4? The main result of the paper, the computation of the relative sample size alpha at which the phase transition occurs, does not seem very accurate when compared to the experiments in figures 3 and 5. It would also have been helpful to plot this value in the figures to make the comparison clear. The discrepancy could be a result of finite-size effects, as the authors claim, but it could equally stem from the assumption made about the Hessian at the threshold states or from the accuracy of the 1RSB ansatz.
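The positive-semidefiniteness assumption the review questions can be inspected numerically. A minimal sketch, assuming the standard squared phase-retrieval loss (the paper's exact setup may differ): the Hessian is an explicit weighted sum of rank-one matrices, so its minimal eigenvalue at any candidate point can be computed directly. At the global minimum the weights reduce to non-negative values, so the Hessian there is provably PSD; at a generic point the same few lines would reveal any negative eigenvalue.

```python
import numpy as np

# Check PSD-ness of the Hessian of L(x) = (1/4n) * sum_i ((a_i^T x)^2 - y_i)^2.
# Dimensions and sample size below are illustrative assumptions.
rng = np.random.default_rng(0)
d, n = 30, 120
x_star = rng.standard_normal(d)
x_star /= np.linalg.norm(x_star)          # unit-norm hidden signal
A = rng.standard_normal((n, d))           # random Gaussian sensing vectors
y = (A @ x_star) ** 2                     # intensity-only measurements

def hessian(x):
    g = A @ x
    # H = (1/n) * sum_i (3 g_i^2 - y_i) a_i a_i^T
    return (A.T * (3 * g ** 2 - y)) @ A / n

# At the global minimum, 3 g_i^2 - y_i = 2 y_i >= 0, so H is a sum of PSD
# rank-one terms and its minimal eigenvalue is non-negative.
lam_min = np.linalg.eigvalsh(hessian(x_star))[0]
```

Evaluating `hessian` at an empirically found threshold state, rather than at the signal, is exactly the check the review asks for.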
Review for NeurIPS paper: Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
The paper makes interesting contributions towards understanding non-convex optimization by studying a problem that is simple enough to allow for analytical calculations. Overall, there is a decent, well-supported agreement between theory and experiment (in particular, between the leading moments of the distribution of the threshold states as evaluated empirically and the computed moments). This paper is a valuable contribution to NeurIPS and should be accepted. We nevertheless recommend several directions along which the paper could be improved to reach a wider audience, and we encourage the authors to revisit the author feedback before submitting the final version. First, the presentation is unusually difficult to follow for a machine learning audience and could be improved by providing more background on the known results used in the paper (e.g., the BPP transition or replica theory), if necessary in the appendix.
Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
Despite the widespread use of gradient-based algorithms for optimising high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem. Here we focus on gradient flow dynamics for phase retrieval from random measurements. When the ratio of the number of measurements to the input dimension is small, the dynamics remains trapped in spurious minima with large basins of attraction. We find analytically that above a critical ratio those critical points become unstable, developing a negative direction toward the signal. Numerical experiments show that in this regime the gradient flow algorithm is not trapped: it drifts away from the spurious critical points along the unstable direction and succeeds in finding the global minimum.
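The setting in the abstract can be reproduced in a few lines. A minimal sketch, assuming the standard squared loss for phase retrieval and a forward-Euler discretisation of gradient flow; the dimension, step size, and measurement ratio alpha = n/d below are illustrative choices, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)
d, alpha = 30, 8.0                        # input dimension and ratio n/d (assumed)
n = int(alpha * d)

x_star = rng.standard_normal(d)
x_star /= np.linalg.norm(x_star)          # unit-norm hidden signal
A = rng.standard_normal((n, d))           # random Gaussian measurements
y = (A @ x_star) ** 2                     # intensities: sign information is lost

def loss(x):
    return np.mean(((A @ x) ** 2 - y) ** 2) / 4.0

def grad(x):
    g = A @ x
    return A.T @ ((g ** 2 - y) * g) / n

x = rng.standard_normal(d)
x /= np.linalg.norm(x)                    # random initialisation on the sphere
for _ in range(20000):                    # forward-Euler steps of gradient flow
    x -= 0.02 * grad(x)

# up to the global sign ambiguity, x should align with the signal at large alpha
overlap = abs(x @ x_star)
```

Rerunning the same loop at small alpha (e.g. alpha near 1) is where one would expect to see the trapping in spurious minima that the paper analyses.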
Data-driven ODE modeling of the high-frequency complex dynamics of a fluid flow
Tsutsumi, Natsuki, Nakai, Kengo, Saiki, Yoshitaka
In our previous paper [N. Tsutsumi, K. Nakai and Y. Saiki, Chaos 32, 091101 (2022)], we proposed a method, which we call the radial function-based regression (RfR) method, for constructing a system of differential equations describing chaotic behavior from observable deterministic time series alone. However, when the behavior of the targeted variable is rather complex, the direct application of the RfR method does not work well. In this study, we propose a novel method for modeling such dynamics, including the high-frequency intermittent behavior of a fluid flow, by considering another variable (the base variable) that shows relatively simple, less intermittent behavior. We construct an autonomous joint model composed of two parts: the first is an autonomous system for the base variable, and the second governs the targeted variable, which is affected by a term involving the base variable so that it can exhibit complex dynamics. The constructed joint model succeeds not only in inferring a short trajectory but also in reconstructing chaotic sets and statistical properties obtained from a long trajectory, such as the density distributions of the actual dynamics.
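The core regression step behind such methods can be illustrated in one dimension. The sketch below is an assumed simplification of the radial-function idea, not the authors' RfR implementation: estimate the time derivative from the observed series alone, then fit the vector field as a ridge-regularised combination of Gaussian radial functions.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):                          # "true" vector field, a stand-in for unknown dynamics
    return np.sin(x) - 0.3 * x

# observable time series generated by forward-Euler integration
dt, steps = 0.01, 4000
x = np.empty(steps)
x[0] = 2.0
for t in range(steps - 1):
    x[t + 1] = x[t] + dt * f(x[t])

dxdt = np.gradient(x, dt)          # derivative estimated from the series alone

centers = np.linspace(x.min(), x.max(), 20)    # radial-function centers
width = (x.max() - x.min()) / 10
Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

lam = 1e-6                          # ridge regularisation for stability
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(centers)), Phi.T @ dxdt)

f_hat = Phi @ w                     # reconstructed vector field along the trajectory
err = np.max(np.abs(f_hat - f(x)))
```

The paper's contribution sits on top of this step: when the targeted variable is too intermittent for a direct fit, the regression is carried out jointly with a simpler base variable whose model feeds a coupling term.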
Physics-constrained deep learning of building thermal dynamics
Energy-efficient buildings are one of the top priorities for sustainably addressing global energy demand and reducing CO2 emissions. Advanced control strategies for buildings have been identified as a potential solution, with a projected energy-saving potential of up to 28%. However, the main bottleneck of model-free methods such as reinforcement learning (RL) is their sampling inefficiency and the resulting requirement for large datasets, which are costly to obtain or often unavailable in engineering practice. On the other hand, model-based methods such as model predictive control (MPC) suffer from the large cost associated with developing a physics-based model of a building's thermal dynamics. We address the challenge of developing cost- and data-efficient predictive models of a building's thermal dynamics via physics-constrained deep learning.
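To make the "physics-constrained" idea concrete, here is a minimal single-zone resistance-capacitance (RC) sketch of building thermal dynamics. This is an assumed toy model, not the paper's architecture: the physical prior (heat flows down the temperature gradient, at a rate set by thermal resistance R and capacitance C) is baked into the update rule, which is the kind of structure a constrained deep model is trained to respect.

```python
import numpy as np

R, C = 2.0, 5.0                 # thermal resistance and capacitance (assumed units)
dt = 0.1                        # integration step

def step(T, T_out, q):
    # indoor temperature T relaxes toward outdoor T_out with time constant R*C,
    # plus a heat input q scaled by the capacitance
    return T + dt * ((T_out - T) / (R * C) + q / C)

T, T_out = 20.0, 5.0
for _ in range(1000):
    T = step(T, T_out, q=0.0)   # free cooling: no heating input
```

With q = 0 the zone temperature decays exponentially toward the outdoor temperature; a physics-constrained network replaces the fixed R and C with learned, sign-constrained parameters while keeping this dissipative structure.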
Connoisseur of chaos
As a high school student in a Detroit suburb in the 1990s, Russ Tedrake did not fit the standard profile of a future computer science professor. Although he had a talent for math -- "I won some of the little math competitions," he says -- he spent his spare time playing football or soccer with friends rather than hacking code or even playing video games; in fact, he didn't get his first computer until he was a senior. He got good grades, but he didn't find the work very demanding. The only calculus class offered at his school was geared to the easier of the two Advanced Placement tests offered by the College Board -- although, against the advice of his teachers, Tedrake took the harder test anyway and did well on it. "I just sort of coasted through," he says.
ORGaNICs: A Theory of Working Memory in Brains and Machines
Heeger, David J., Mackey, Wayne E.
Working memory is a cognitive process that is responsible for temporarily holding and manipulating information. Most of the empirical neuroscience research on working memory has focused on measuring sustained activity in prefrontal cortex (PFC) and/or parietal cortex during simple delayed-response tasks, and most models of working memory have been based on neural integrators. But working memory means much more than just holding a piece of information online. We describe a new theory of working memory, based on a recurrent neural circuit that we call ORGaNICs (Oscillatory Recurrent GAted Neural Integrator Circuits). ORGaNICs are a variety of Long Short-Term Memory units (LSTMs), imported from machine learning and artificial intelligence. ORGaNICs can be used to explain the complex dynamics of delay-period activity in PFC during a working memory task. The theory is analytically tractable, so we can characterize the dynamics, and it provides a means for reading out information from the dynamically varying responses at any point in time, in spite of the complex dynamics. ORGaNICs can be implemented with a biophysical (electrical-circuit) model of pyramidal cells, combined with shunting inhibition via a thalamocortical loop. Although introduced as a computational theory of working memory, ORGaNICs are also applicable to models of sensory processing, motor preparation and motor control. ORGaNICs offer computational advantages compared to other varieties of LSTMs commonly used in AI applications. Consequently, ORGaNICs are a framework for canonical computation in brains and machines.
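The gated-integrator mechanism at the heart of the abstract can be sketched in a few lines. This is an assumed simplification, not the published ORGaNICs circuit equations: an input gate b loads a stimulus into the state, and a recurrence gate a cancels the leak during the delay period, producing sustained delay activity.

```python
import numpy as np

def gated_integrator(z, a, b, tau=1.0, dt=0.01):
    """Leaky integrator with an input gate b and a recurrence gate a."""
    y = 0.0
    trace = []
    for t in range(len(z)):
        # with a = 1 and b = 0, the leak -y is exactly cancelled by the
        # recurrent term a*y, so the state is held through the delay
        dy = (-y + b[t] * z[t] + a[t] * y) / tau
        y += dt * dy
        trace.append(y)
    return np.array(trace)

steps = 2000
z = np.zeros(steps); z[:100] = 1.0     # brief stimulus, then a delay period
b = np.zeros(steps); b[:100] = 1.0     # input gate open only during the stimulus
a = np.ones(steps);  a[:100] = 0.0     # recurrence gate open during the delay

trace = gated_integrator(z, a, b)
```

During the stimulus the state charges toward the input; once the gates flip, the derivative is identically zero and the loaded value persists, which is the integrator account of delay-period activity that ORGaNICs then enriches with oscillatory and normalizing dynamics.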